DISSERTATION

An Evaluation of the UCPOP Planning Algorithm Operating in a Dynamic Environment as Implemented on a Sony Playstation Platform

As part of the course requirement for the Master of Science Degree in Computing for Commerce and Industry

Author: Brian Warner
ID: P2256553
Date: 22 September 2001
Word Count: 14,599

Preface

My thanks are due to Dr Alan Harrison, my supervisor, for guidance, and to Rebecca for continued support throughout the duration of the M801 Project.

Contents

1 Introduction
  1.1 Outline of the Problem Being Investigated
  1.2 Aims of the Project
  1.3 Chapter Overview
2 Artificial Intelligence Planning in Dynamic Environments
  2.1 Introduction
  2.2 Classical Planning Assumptions
  2.3 Planning in Dynamic Environments
  2.4 Summary
3 Selection of the Planning Algorithm
  3.1 Introduction
  3.2 Requirements of the Planner
    3.2.1 Interleaving Planning, Execution and Monitoring
    3.2.2 Representation of Time
  3.3 Comparison of Planner Types
  3.4 UCPOP
    3.4.1 Heuristic Enhancements
  3.5 Summary
4 Evaluation System Requirements
  4.1 Introduction
  4.2 Requirements of the Dynamic Environment
  4.3 Requirements of the Evaluation System
  4.4 Summary
5 Evaluation System Design
  5.1 Introduction
  5.2 Design of the Dynamic Environment
    5.2.1 Representation of Time
    5.2.2 Environment Objects and Actions
    5.2.3 Formal Representation in PDDL
  5.3 Design of the Evaluation System
    5.3.1 Evaluation System Structure
    5.3.2 Design of the Search Algorithm
    5.3.3 Design of the Planning Algorithm
  5.4 Summary
6 Definition of the Test Scenarios
  6.1 Introduction
  6.2 Parameters
  6.3 Collection of Metric Information
  6.4 Test Scenarios
    6.4.1 Environment Complexity
    6.4.2 Rate of Dynamic Change
    6.4.3 Agent Start Location
    6.4.4 Constants
  6.5 Summary
7 Implementation of the Evaluation System
  7.1 Introduction
  7.2 Implementation Platform
  7.3 Evaluation System
    7.3.1 Initialisation
    7.3.2 Main Loop
    7.3.3 Close Down
  7.4 Dynamic Environment
    7.4.1 Environment Objects
    7.4.2 Dynamic Events
    7.4.3 Goal Preconditions and Action Effects
  7.5 Planning Algorithm
    7.5.1 UCPOP MGU Implementation
  7.6 Search Algorithm
    7.6.1 Generation of the Search Tree
    7.6.2 Route Selection
    7.6.3 Identification of Route Preconditions
  7.7 Plan Execution and Monitoring
    7.7.1 Execute Plan
    7.7.2 Select the Next Action
  7.8 Test Scenarios
  7.9 Summary
8 Results of the Execution of the Test Scenarios
  8.1 Introduction
  8.2 Method of Execution
  8.3 Results Obtained
  8.4 Summary
9 Discussion of Test Results
  9.1 Analysis of the Test Results
    9.1.1 Environment Complexity
    9.1.2 Rate of Dynamic Change
    9.1.3 Agent Starting Location
  9.2 Generalisation
10 Conclusions
Appendix A: Glossary
Appendix B: Design Documents
Appendix C: Implementation Documents
Appendix D: Test Scenario Definition
Appendix E: Test Results
Appendix F: Diskette Contents
References
Index

List of Tables

Table 5-1 Environment Terrain Types
Table 5-2 Environment Objects
Table 5-3 Actions that can be Applied to the Environment
Table 5-4 Search Tree Node Structure
Table 5-5 Goal Node Structure
Table 5-6 Mapping UCPOP to the Logical Plan Class Diagram
Table 5-7 Operator Dependencies
Table 6-1 Metrics to be Gathered About the Environment
Table 6-2 Metrics Gathered During Test Scenario Execution
Table 6-3 Scenario Complexity Calculation and Weightings
Table 7-1 Common Condition Settings for Preconditions and Effects

List of Figures

Figure 5-1 Environment Objects Class Diagram
Figure 5-2 PDDL Representation of Environment Objects and Actions
Figure 5-3 Process Model for Plan/Execute/Monitor Cycle
Figure 5-4 Pseudo Code for Dynamic Events
Figure 5-5 Agent State Transition Diagram
Figure 5-6 Search Tree
Figure 5-7 Search Tree Pseudo Code
Figure 5-8 UCPOP Algorithm, Reproduced with Permission
Figure 5-9 Logical Plan Class Diagram
Figure 6-1 Scenario Complexity Graph
Figure 7-1 Application Development Process
Figure 7-2 Application Execution on the Sony Playstation
Figure 7-3 Functional Structure of the Application Prototype
Figure 7-4 Source File Structure
Figure 7-5 C Structures for Environment Objects
Figure 7-6 C Structures for Goal Preconditions and Action Effects
Figure 7-7 Structure of Plan Creation Functions
Figure 7-8 Plan Entity Relationship Diagram
Figure 7-9 Plan Data Structures
Figure 7-10 Example Plan
Figure 7-11 Search Functions Structure
Figure 7-12 Search Entity Relationship Diagram
Figure 7-13 Search C Structures
Figure 7-14 Structure of Execute and Monitor Functions
Figure 7-15 Plan Data Structure
Figure 9-1 Number of Search Nodes Examined by Scenario
Figure 9-2 Plan, Preconditions, Action Effects and Orderings by Complexity
Figure 9-3 Preconditions Met, Failed and Un-attempted by Complexity
Figure 9-4 Searches Performed and Search Nodes by Complexity
Figure 9-5 Mean Results for Dynamic Rate of Change
Figure 9-6 Mean Results by Agent Start Location

Abstract

This dissertation evaluates the UCPOP Artificial Intelligence (AI) Planning algorithm to assess whether it performs satisfactorily against a dynamic environment. It covers the Classical AI Planning assumptions and describes how they must be removed to allow practical application to complex, real-world environments. A brief comparison is given of the differing planning approaches, and the reasons for the choice of UCPOP are documented. Also discussed are the role of heuristics in increasing performance and the importance of time.

A design and implementation of the evaluation system is provided. It covers the planning algorithm, the functionality to execute and monitor the progress of plans, and the simulated dynamic environment itself.

Results obtained from the execution of varied test scenarios indicate that the algorithm's performance decreases at a less than linear rate as complexity is increased. Additional data would be required before analytical tools and techniques could be applied to establish an exponential relationship. The dissertation concludes with a review of the project and suggestions for further research.

Introduction

Outline of the Problem Being Investigated

Artificial Intelligence (AI) Planning creates a plan, consisting of a set of actions, which achieves a goal when applied to an environment.
The AI Planner takes as inputs the goals of an Agent, the set of available actions that can be applied, and the current state of the environment against which the plan will be executed – Weld (1994).

Research into Classical AI Planning started in the 1960s and was constrained by a number of assumptions. It was assumed that the Agent was the only source of actions that could change the environment, and that the results of action application were always determinate. There was also no concept of the passage of time. The processes of generating a plan and of applying actions to change the environment were seen as instantaneous, making real-world application difficult.

The process of generating a plan is actually the combination of a planning component and two other interdependent elements: search algorithms and knowledge representation. McDermott and Hendler (1995) state that search algorithms have been crucial to planning from the start and remain so today. A knowledge representation language is also required to define precisely the state of the environment, how actions change it, and what preconditions must be satisfied before an action can be applied. The interdependence of the three elements is emphasised by Ginsberg (1993), who says that:

"The intended role of knowledge representation in artificial intelligence is to reduce problems of intelligent action to search problems."

The early assumptions have limited the real-world application of planning systems. McCarthy and Pollack (2000) highlight the fact that most of the research has been at a theoretical level, as real-world environments invariably break Classical assumptions.
Therefore, for AI Planning to be more practical and more applicable to real-world situations, there was a need to remove the assumptions. Planners working in dynamic environments need to deal with changes that have not been initiated by the Agent. They need a temporal representation of actions so that events, and the planning process itself, are not executed instantaneously. They also need to discard the assumption of determinate actions in order to deal with situations in which actions sometimes fail to have the expected effect.

This project aims to assess whether an AI Planner can work satisfactorily in a dynamic environment where Classical assumptions have been removed and in which it is therefore necessary not only to plan, but also to execute and monitor results. A Sony Playstation will provide the real-world platform onto which the planner and the dynamic environment will be implemented.

Aims of the Project

The aims of the project are to:

- Design, using a suitable methodology, an implementation of Penberthy and Weld's (1992) UCPOP planner with execution and monitoring functionality.
- Design a dynamic test environment using a suitable knowledge representation technique.
- Construct varied test scenarios from which data can be gathered for analysis.
- Implement the AI Planner, the dynamic environment and the test scenarios on the practical implementation platform provided by the Sony Playstation.
- Execute the test scenarios to gather data about the performance of the planner against each of the test scenarios.
- Analyse the data to assess whether the planner works satisfactorily in the dynamic environment.

Chapter Overview

Chapter 2 describes AI Planning, the assumptions made in Classical Planning, and the different considerations when planning in dynamic environments.
Chapter 3 covers the requirements of a planner operating in a dynamic environment, the different types of planners that were considered, and the reasons for the selection of UCPOP as the basis of the planning algorithm being evaluated.

Chapter 4 documents the requirements of the dynamic environment and the system that will allow the planning algorithm to be evaluated.

Chapters 5 and 6 cover the design of the system. Chapter 5 documents the design of the dynamic environment, the planning algorithm, the search algorithms and the evaluation system. Chapter 6 identifies the test parameters that will be varied within the dynamic environment, the metric information I intend to collect for later analysis, and the definition of each specific test scenario.

Chapter 7 takes the output from the design stage of the project and the defined test scenarios and describes the implementation on the Sony Playstation.

Chapters 8 and 9 cover the execution of the test scenarios and a discussion of the results obtained. Chapter 8 provides a description of the methods used to conduct the tests and a presentation of the results. Chapter 9 presents a quantitative analysis of the planner performance against the test scenarios, a generalisation of those results, and comments on experimental noise.

Chapter 10 provides a review of the project as a whole, assessing whether the objectives set out at the start have been met.

Artificial Intelligence Planning in Dynamic Environments

Introduction

This chapter describes the Classical AI Planning assumptions that have limited real-world application, and the different considerations that need to be made when planning in dynamic environments. It shows that, for dynamic environments, planning and execution are inseparable, as Knoblock (1996) states.
It also reinforces the adoption of Knoblock's (1996) definition of planning as the attainment of goals through successful plan execution, and not just plan generation.

Classical Planning Assumptions

McDermott and Hendler's (1995) history of AI Planning shows that Classical AI Planning began in the 1960s and centred on the application of search techniques and theorem proving. The most influential of the planners constructed at this time was Fikes and Nilsson's (1971) STRIPS. STRIPS endeavours to find a combination of operators that turns the initial state of an environment into the goal state. The effects of an operator are held as two lists: one lists the facts added to the state of the environment, whilst the other lists the facts deleted from it.

Planning problems become computationally very difficult and search-space intensive very quickly – Weld (1994). This is because AI Planning belongs to the 'synthesis' group of problems, for which Ginsberg (1993) suggests the search space is much larger than for classification problems.

Both McCarthy and Pollack (2000) and Weld (1994) detail the Classical assumptions initially adopted to simplify planning problems. The assumptions are that:

- The only changes made to the environment are the result of the actions applied by the Agent. The environment is therefore static.
- The Agent has complete knowledge about the environment. There is no uncertainty.
- The results of actions applied by the Agent are determinate. The Agent will know the outcome of any action that it applies and, as Hanks et al. (1993) highlight, the planner will know ahead of time whether a plan will succeed.
- Actions are applied instantaneously. There is no concept of actions taking time to apply.
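The two-list representation of STRIPS operator effects can be sketched in C, the language used for this project's implementation. The structures, names and sizes below are illustrative assumptions, not the dissertation's actual code; propositions are reduced to integer identifiers for brevity.

```c
#include <stddef.h>

#define MAX_FACTS 32

/* A world state: the set of propositions currently true. */
typedef struct {
    int facts[MAX_FACTS];
    size_t count;
} State;

/* A STRIPS-style operator: an add-list and a delete-list. */
typedef struct {
    const int *add;  size_t n_add;   /* propositions made true  */
    const int *del;  size_t n_del;   /* propositions made false */
} StripsOperator;

int state_holds(const State *s, int fact) {
    for (size_t i = 0; i < s->count; i++)
        if (s->facts[i] == fact) return 1;
    return 0;
}

static void state_remove(State *s, int fact) {
    for (size_t i = 0; i < s->count; i++)
        if (s->facts[i] == fact) { s->facts[i] = s->facts[--s->count]; return; }
}

/* Apply the operator: remove the delete-list, then add the add-list. */
void strips_apply(State *s, const StripsOperator *op) {
    for (size_t i = 0; i < op->n_del; i++)
        state_remove(s, op->del[i]);
    for (size_t i = 0; i < op->n_add; i++)
        if (!state_holds(s, op->add[i]) && s->count < MAX_FACTS)
            s->facts[s->count++] = op->add[i];
}
```

Applying an operator with delete-list {1} and add-list {3} to the state {1, 2} yields the state {2, 3}, which is the whole of the STRIPS state-transition model described above.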
Planning in Dynamic Environments

In the forward-looking section of his influential paper, Chapman (1987) calls for the removal of Classical assumptions to allow planning to be applicable to real-world, dynamic environments. Having just presented a formal proof for his Partial Order Planner TWEAK, he goes on to say that it is next to useless in a practical sense, highlighting the simplifications under which AI Planning had been working until that time. This view is reinforced by McCarthy and Pollack (2000), who plainly state that real-world environments break Classical planning assumptions.

Research into planning in dynamic environments has been undertaken from the mid-1980s – McDermott and Hendler (1995) – and has focused on the attempt to cast off Classical constraints. Knoblock (1996) lists the assumptions that need to be challenged, the most important of which to this project are that:

- Environments are not deterministic, and the Agent is not the sole initiator of actions that change them. Environments are dynamic.
- The application of action effects to the environment is not instantaneous. Actions take time to apply, and time is a crucial resource that must be considered.
- The results of Agent actions are not deterministic, and the planner will need to monitor the execution of the plan for failures.

To simplify the project being undertaken, the Agent will retain complete information about the environment. The project will not deal with uncertainty.

Aylett et al. (2000) provide a comparison of the progress made in the practical application of AI Planning with that of Knowledge Based Systems. They note that whilst Knowledge Based Systems have many real-world implementations and general-purpose implementation techniques, AI Planning is seen as mostly preoccupied with theory.
There are notable exceptions. Aylett et al. (2000) cite military applications from Desert Storm, and Weld (1999) highlights Williams and Nayak's spacecraft control system as an indication that AI Planning is showing increased maturity. Punch et al. (1996) also describe the practical use of Knowledge Based GTM (Generic Task Model) techniques to implement AI Planning in preference to extending Classical planning systems such as Fikes and Nilsson's (1971) STRIPS.

Practical AI Planners have to deal with fundamental changes to Classical planning assumptions. They need not just to create plans to achieve goals, but also to execute those plans in the dynamic environment and to monitor the results of applying actions to check for unexpected failure. Punch et al. (1996) note that dynamic environments may change independently of Agent actions. McCarthy and Pollack (2000) take this one step further and state that dynamic environments may even change whilst the planner is constructing a plan. This highlights the fact that successful execution of a plan cannot be guaranteed amid the complexities and uncertainties of real-world environments. AI Planners working in dynamic environments therefore need to be able to recover from the failure to apply the effects of an action, and to re-plan when necessary.

Knoblock (1996) suggests that considering the execution of a plan as an integral part of the planning process challenges Classical assumptions and ultimately leads to more practical planners. He goes on to assert that the purpose of planning is not the actual plan but the attainment of goals through the successful execution of the plan. This definition emphasises the dependence between plan generation and plan execution in dynamic environments, and it is the definition that has been adopted by this project.
The evaluation system that the project aims to build will therefore have to adopt the considerations for planning in dynamic environments that remove Classical limitations. It also needs a methodology on which to base the design and implementation.

Summary

This chapter has shown that the project must remove Classical planning assumptions to build a planner that can execute plans in a dynamic environment. It has also stated the adoption of Knoblock's (1996) definition of planning as the attainment of goals through plan execution.

Selection of the Planning Algorithm

Introduction

Weld (1999) highlights the dramatic advances in AI Planning technology during the second half of the 1990s. Research was focused on propositional planners such as Blum and Furst's (1997) Graphplan because of the huge performance gains they introduced. Practical implementation of the plan/execute/monitor cycle also matured, with the development of applications such as the spacecraft control system aboard NASA's Deep Space One.

This chapter sets out the requirements for a planner operating in a dynamic environment that will need to be met by the design and implementation stages of this project. It also discusses the different planning approaches that were considered as the basis of the planning algorithm, and sets out the reasons for the choice of Penberthy and Weld's (1992) UCPOP.

Requirements of the Planner

There are a number of characteristics that the planner must exhibit. Some are common to all planners; others are specific to planners working in dynamic environments.

The planner should exhibit Completeness: if a solution to the planning problem exists, then the planner will be able to find it. Equally, as pointed out by Russell and Norvig (1995), the planner should be able to detect where no solution exists.
The planner should also exhibit Soundness: if the planner returns a solution plan, then that plan will be correct.

The planner must have a suitably expressive knowledge representation language to be able to model accurately the dynamic environment, action preconditions and effects, and Agent goals. Weld (1994) observes that the more expressive the language the planning algorithm handles, the greater the degradation in performance. The knowledge representation language will therefore have a bearing on the overall ability of the planner to work satisfactorily in a dynamic environment.

The planner needs to be able to deal with the fact that changes will occur within the environment that are not the result of Agent actions. Russell and Norvig (1995) and Haigh and Veloso (1998) summarise a number of approaches to dealing with these unforeseen eventualities. Conditional planning generates a plan for all eventualities. Probabilistic planning refines the approach by generating plans for all likely events and relies on re-planning for less likely outcomes. Both of these approaches quickly become unmanageable, and the project will therefore pursue an approach of execution monitoring to ascertain when a re-plan is required. The interleaving of planning, execution and monitoring is discussed in more detail below.

The planner must also have a temporal representation to allow non-instantaneous action effects. Again, the representation of time is discussed in more detail further on.

The overall objective of the project is to assess whether a planner can operate satisfactorily in a dynamic environment. To achieve practical performance it may be necessary to trade off desirable attributes such as Completeness, and to find a balance between planning, executing and monitoring.
Pollack and Ringuette (1990) discuss the trade-off between performance and accuracy faced by the Agent in their Tileworld dynamic test bed. The trade-off decisions for this project will be made in Chapter 5 as part of the design process.

Interleaving Planning, Execution and Monitoring

Classical planning assumptions allow a planner to determine, at the time of creation, whether the plan it has generated for execution in a static environment will succeed. This is not the case in dynamic environments, which means that the planning system must monitor the execution of the steps of the plan for unexpected failures. Pollack and Ringuette (1990) with the Tileworld simulation, Haigh and Veloso (1998) with the ROGUE robot planning system, and Ambros-Ingerson and Steele (1988) with IPEM, all give practical examples of mechanisms for interleaving planning and execution. The technology is even starting to be employed in retail consumer products such as the Friendly Robotics automatic lawn mower and the, as yet unlaunched, Dyson DC06 automatic vacuum cleaner.

Haigh and Veloso (1998) highlight the results of Agents' actions and dynamic events as the types of occurrence that need to be monitored by the planning system. Both are relevant to this project. They also discuss approaches, such as Conditional and Probabilistic planning, which can be adopted for dealing with unexpected events. These approaches are not particularly scalable and quickly become unwieldy.

Haigh and Veloso (1998) pose the question of when the planning system should stop planning and start executing. They discuss Anytime planners, which are interruptible and will return the best plan found to date. Whilst this is a consideration for extremely dynamic environments such as Pollack and Ringuette's (1990) Tileworld, where opportunities are lost if planning consumes too much time, it will not be implemented for this project's planner.
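The interleaving of planning, execution and monitoring can be sketched as a simple loop in C. The toy domain below, an agent moving along a line with one scripted dynamic event that knocks it backwards, is entirely an illustrative assumption and is not taken from the dissertation's environment; it shows only the shape of the cycle: generate a plan, execute one action at a time, monitor for the expected effect, and re-plan on failure.

```c
#define GOAL_POS 5

/* Toy environment state. The scripted event at tick 3 simulates a
 * dynamic change not initiated by the agent. */
static int position = 0;
static int tick = 0;
static int event_at_tick = 3;

/* The "plan" is simply a count of +1 moves still to execute. */
static int steps_remaining = 0;

static void generate_plan(void) { steps_remaining = GOAL_POS - position; }

/* Execute one action; the environment may also change independently. */
static void execute_step(void) {
    position++;
    steps_remaining--;
    tick++;
    if (tick == event_at_tick) position -= 2;   /* dynamic event fires */
}

/* Plan/execute/monitor cycle. Returns the number of re-plans needed. */
int run_agent(void) {
    int replans = -1;
    while (position != GOAL_POS) {
        generate_plan();
        replans++;
        while (steps_remaining > 0) {
            int expected = position + 1;
            execute_step();
            if (position != expected)
                break;   /* monitor detected an unexpected failure */
        }
    }
    return replans;
}
```

In this scripted run the dynamic event invalidates the first plan part-way through, the monitor detects that the expected effect did not hold, and a single re-plan then reaches the goal.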
Russell and Norvig (1995) also cover approaches to planning in dynamic environments, and suggest the simple approach that I have adopted for the project: the planning system will use execution monitoring to detect when a failure has occurred, at which point it will re-plan. The plan/execute/monitor cycle is covered as part of the design of the evaluation system in Chapter 5.

Representation of Time

The project needs to implement a representation of time that will allow the Classical assumptions of instantaneous plan generation and instantaneous application of action effects to be removed. The representation of time is a crucial consideration for AI Planning, a fact that has been highlighted by many researchers – Hanks et al. (1993), Chapman (1987), Allen (1984) and Russell and Norvig (1995). Allen (1984) even puts forward a temporal logic for reasoning about actions, to overcome what he sees as the inadequate representation of plans.

Time as a resource to be consumed also needs to be considered. Pollack and Ringuette (1990) highlight the importance of time as a resource in their dynamic Tileworld test bed by equating the time spent generating a plan to lost opportunity in the dynamic environment.

The evaluation system therefore needs to implement a representation of time that allows plan generation and action effects to be non-instantaneous. Time consumption by the planner needs to be a consideration of the plan/execute/monitor cycle.

Comparison of Planner Types

There are a number of different planning approaches, and it was necessary to select one as the basis of the planning algorithm to test against the dynamic environment. The following comparison of planner types is by no means exhaustive, but gives the background to the reasons for the choice of UCPOP.
Whilst Russell and Norvig (1995) see the original STRIPS planner as too restrictive for complex systems, there have been practical implementations of Classical Planners in real-world environments. These implementations have ranged from manufacturing systems to spacecraft control systems, but the common factor in the environments in which they operate is that they are all designed to be controlled. The dynamic environment for this project is not designed to be controlled, and the planning algorithm will therefore not be based on a STRIPS-like planner.

Ginsberg (1993) comments that non-linear (Partial Order) plans are more expensive to compute than linear (Total Order) ones, but require less search space. Given the practical constraints of the implementation platform – 0.5MB of memory on the Sony Playstation – a Partial Order planner is preferable. Minton et al. (1992) also present evidence to suggest that Partial Order Planners are more efficient when using more expressive knowledge representation languages.

Sacerdoti's NOAH planner and Tate's NONLIN pioneered Partial Order Planning in the latter half of the 1970s. Chapman's (1987) TWEAK planner built on the ideas of Sacerdoti and Tate, amongst others, to present a logical formalisation for Partial Order Planning. Chapman saw this as a 'neat' formalisation of previous, 'scruffy' work, but still felt that TWEAK was so constrained that it was practically useless for real-world application.

Weld (1994) notes that Partial Order Planners (POPs) take a least-commitment approach, only imposing an ordering between plan steps when it is absolutely necessary. The steps are associated with the goals by Causal Links. Causal Links associate the effects of an action with the goal preconditions that they meet.
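A causal link and the threat test built on it can be sketched in C. The representation is an illustrative assumption, not the dissertation's actual structures: steps are integer identifiers, conditions are signed literal identifiers (a negative value denoting the negated proposition), and whether the partial order permits a step to fall between the link's producer and consumer is passed in as a pre-computed flag.

```c
/* A causal link: the producer step's effect establishes `condition`,
 * which satisfies a precondition of the consumer step. */
typedef struct {
    int producer;
    int consumer;
    int condition;   /* protected literal; -x denotes the negation of x */
} CausalLink;

/* A step threatens a link if one of its effects negates the protected
 * condition and the partial order allows it to fall between the link's
 * producer and consumer (`may_fall_between` is that ordering check,
 * pre-computed by the caller). */
int threatens(const CausalLink *link, int step_id,
              const int *effects, int n_effects,
              int may_fall_between)
{
    if (step_id == link->producer || step_id == link->consumer)
        return 0;
    if (!may_fall_between)
        return 0;
    for (int i = 0; i < n_effects; i++)
        if (effects[i] == -link->condition)
            return 1;
    return 0;
}
```

The two classic resolution strategies, promotion and demotion, correspond to forcing `may_fall_between` to be false by ordering the threatening step after the consumer or before the producer.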
Causal Links allow a planner to determine when the addition of new plan steps threatens to invalidate the partial plan already in place. Weld (1999) notes that Causal Link Planners have received less research attention in the second half of the 1990s, as they are comprehensively outperformed by Propositional Planners over a range of problem domains. Weld (1999) describes Blum and Furst's (1997) Graphplan propositional planner as simple, elegant and speedy.

Another area of research has been Reactive Planners. Rich and Knight (1991) describe how Reactive Planners avoid planning altogether. Instead of making a plan, they choose individual actions from a knowledge base depending on the situation in which they find themselves. McCarthy and Pollack (2000) note that this is the approach adopted by robot designers, who ignore more proactive planners. Agre and Chapman (1987) also present Pengi, a practical implementation of a Reactive Planner. Pengi is built on the premise that the real world is so complex and uncertain that traditional planners would never scale up to be useful. Pengi merely reacts to the immediate threats and opportunities in its environment, which is provided by the 1980s video game Pengo.

Reactive Planners do highlight the shortcomings of Classical planners, namely the need to interleave planning and execution. They also highlight that resources are finite, particularly time, which Russell and Norvig (1995) claim is the most important resource of all. Completely Reactive Planners, such as Pengi, are an extreme approach and have met with limited success, Ginsberg (1993) noting that prior analysis of all possible situations is impossible. It is therefore unlikely that this project will require anything as reactive.
McDermott and Hendler (1995) report on an interesting hybrid approach that couples the strength of traditional planners in planning ahead with the ability of reactive planners to deal with uncertainty and dynamic environments. On a practical basis, Cohen et al.'s (1989) Phoenix simulator included reactive, tactical responses to virtual fires in front of bulldozers rather than consuming time computing a strategic solution. Again, the requirement for this project to implement a hybrid approach is questionable and will not be progressed. In making the choice of which planning approach to implement, the need to restrict the search space used on the implementation platform is paramount and therefore rules out the use of linear, total order planners. Secondly, whilst the proposed environment is dynamic, it will afford the Agent the time to construct plans to reach its goals. The project's planner will therefore not be based on reactive algorithms. The final choice, between a causal link, partial order planner and a propositional planner, is far closer. Blum and Furst (1996) show Graphplan to be significantly quicker than both UCPOP and Prodigy in a number of problem domains, but the test domains they use are static rather than dynamic. Weld (1999) also makes several points in favour of causal link planners. He states that they are less sensitive to irrelevant information in the initial state and perform better in domains with incomplete information. Weld (1999) also questions the scalability of Graphplan enhancements to deal with uncertainty. McCarthy and Pollack (2000) echoed these concerns and based their own research into dynamic environments on the UCPOP algorithm.
So, even though uncertainty within the domain is not an issue for this project, taking into account the substantial performance advantage of Graphplan and its derivatives, and appreciating that AI Planning is an active research field with advances being made all of the time, I still believe that causal link algorithms are generally more applicable, having proven real-world examples of use within dynamic environments. The planning algorithm will therefore be based on Penberthy and Weld's (1992) UCPOP algorithm.

3.4 UCPOP

Penberthy and Weld's (1992) UCPOP is a causal link, partial order planner that supports universal quantification and conditional effects. It handles a subset of Pednault's ADL - Action Description Language. It is both sound and complete. Starting with an initial, dummy plan, the algorithm attempts to add new steps and constraints to achieve goals. It works in a recursive manner, with the main loop attempting to find an action to apply that will meet an outstanding goal precondition. This is achieved by the MGU theorem prover, which links goal preconditions to action effect postconditions. Any new preconditions of the applied action are also added to the outstanding goal list. The algorithm maintains a record of the causal links between action effects and the goal preconditions that they fulfil. Through examination of the causal links, it is able to identify threats to actions already in place and has strategies to endeavour to resolve them. Recursion continues until all of the outstanding goal preconditions have been met - and a valid plan therefore constructed - or until a valid action can no longer be found and the plan generation fails. On a practical basis, the planning algorithm has been implemented in real-world situations that require a plan/execute/monitor cycle by researchers such as McCarthy and Pollack (2000).
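The recursive loop described above can be illustrated with a much-simplified sketch. The sketch below is illustrative only, written in Python rather than the language of the evaluation system: it omits orderings, binding constraints and threat resolution, uses ground literals so that simple set membership stands in for MGU unification, and all action and predicate names are invented.

```python
# Much-simplified, illustrative partial-order-style planning loop.
# Real UCPOP also manages orderings, variable bindings and threats;
# here actions are ground, so set membership replaces MGU unification.

ACTIONS = {
    # name: (preconditions, effects) -- all names are invented
    "pickup-key": ({"at-key"}, {"got-key"}),
    "unlock-door": ({"got-key", "at-door"}, {"door-unlocked"}),
    "open-door": ({"door-unlocked", "at-door"}, {"door-open"}),
}

def plan(goals, state, steps=None, depth=0):
    """Recursively satisfy outstanding goals, returning an ordered list
    of steps, or None if plan generation fails."""
    if steps is None:
        steps = []
    if depth > 20:                      # guard against runaway recursion
        return None
    open_goals = [g for g in goals if g not in state]
    if not open_goals:                  # all preconditions met: valid plan
        return steps
    goal = open_goals[0]
    for name, (pre, eff) in ACTIONS.items():
        if goal in eff and name not in steps:   # an effect meets the goal
            # the action's preconditions join the outstanding goal list
            result = plan((goals - {goal}) | pre, state,
                          [name] + steps, depth + 1)
            if result is not None:
                return result
    return None                         # no applicable action: failure

print(plan({"door-open"}, {"at-key", "at-door"}))
# → ['pickup-key', 'unlock-door', 'open-door']
```

Each recursive call takes an outstanding goal, selects an operator whose effect meets it, and adds that operator's preconditions to the outstanding goal list - the same Operator Selection / Sub-goal Generation / Recursive Invocation structure as Steps #3, #4 and #6 of the UCPOP algorithm described in section 5.3.3.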
3.4.1 Heuristic Enhancements

Chapman (1987) states that planners need to be domain independent to be generally useful. The consensus of research opinion is that this is not achievable, as Gaines (1996) states by saying:-

"Designing a domain independent, sound planner that can work efficiently in a complex domain is an infeasible task."

Penberthy and Weld (1992) state their belief that domain dependent heuristics offer the best possibility for performance enhancements. Aylett et al. (2000) also argue that whilst domain independent planners are powerful, they require domain specific problem solving knowledge to achieve acceptable performance. Even Blum and Furst (1997) indicate that they would be looking at heuristic enhancements to their Graphplan planner, even though it had already produced a performance leap of an order of magnitude over the previous generation. The project will therefore need to implement heuristic enhancements. Gaines' (1996) GTD-POP planner adds causal domain knowledge, whilst Weld (1994) suggests that enhancements are often made to the search algorithms employed. Weld (1994) also notes that it is often necessary to sacrifice Completeness in a trade-off against performance. Chapman (1987) himself concludes by saying that there may only be heuristic solutions rather than general purpose ones. This I believe to be the case, given the subsequent weight of research. The project is therefore justified in adding heuristic performance enhancements to the generic UCPOP planning algorithm it will be assessing in the dynamic environment, although this will have a bearing on the generic conclusions that can be drawn from the results. The heuristics added to the system are described in Chapter 5.

3.5 Summary

This chapter has set out the requirements of the planner, highlighting the need to interleave planning/execution/monitoring and emphasising the importance of the representation of time.
It has provided a comparison of the different approaches to planning and stated the reasons for the choice of UCPOP. An overview of the operation of UCPOP has been given and the intention to make heuristic enhancements stated.

4 Evaluation System Requirements

4.1 Introduction

This chapter documents the requirements of the dynamic environment and the system that will allow the planning algorithm to be evaluated.

4.2 Requirements of the Dynamic Environment

Hanks et al. (1993), in a discussion about the effectiveness of test-beds, state that the ultimate interest is in real-world systems working in complex environments. Unfortunately it is not always possible to build real-world examples and it is therefore necessary to build simulations, as I am aiming to do as part of this project. Simulations need to exhibit certain characteristics to be useful. The requirements of the dynamic environment for this project are:-

As with Cohen et al.'s (1989) Phoenix system, the dynamic environment I am going to build needs to have variable parameters. Phoenix allowed parameters such as wind speed to affect the rate and direction in which fires spread. The environment I am building needs to have parameters that can be varied to represent the differing test scenarios.

There also need to be dynamic events that are not initiated by Agent actions, but still affect the state of the environment. This feature overcomes the Classical planning assumption that all changes to the environment are the result of Agent actions.

The environment will contain an Agent and the Agent will need to have the concept of goals that it will endeavour to achieve. The Agent will generate and execute plans to achieve the goals. Unlike the system described by Haigh and Veloso (1998), Agent goals will be set at the start of the simulation. The Agent will not have to deal with additional goals being raised during the test scenario execution.
As previously mentioned, it is absolutely essential that the environment has a well-defined model of time to cover Agent actions and dynamic events. Time is highlighted as an essential component of a simulator by Hanks et al. (1993).

The actions that the environment will simulate must effect changes in the environment that will allow the Agent to meet its goals. Actions must be governed by the model of time to make them non-instantaneous. They also need to exhibit non-determinate elements, to allow the simulation of real-world actions that fail to achieve their expected effects, and so create the need to monitor the results.

Finally, the environment must be measurable and allow metric information to be gathered for later analysis.

For the purposes of simplifying the project, the Agent will be supplied with complete information about the environment, as was the case with Pollack and Ringuette's (1990) Tileworld. This removes uncertainty and the need to gather information through sensing. It also removes the need for the project to consider the extensive AI planning field that covers uncertainty. I have also excluded the concept of belief from the scope of the project. Allen (1984) proposes an Agent model of the environment that includes a belief about the future state. The Agent I am going to implement has no belief model and will not be able to reason about future events. It will base all of the decisions it makes on the current state of the environment.

4.3 Requirements of the Evaluation System

The evaluation system must address the following requirements:-

It must manage the plan/execute/monitor cycle to allow the Agent to apply its plans to the environment to achieve its goals.
The cycle must cater for the representation of time, to allow for the removal of the Classical assumptions called for by Chapman (1987) and demonstrated by Ambros-Ingerson and Steel (1988) with their IPEM system.

It must also include an implementation of Penberthy & Weld's (1992) UCPOP planning algorithm. Weld (1994) states that planning algorithms are most often formulated as search problems, and search algorithms will therefore be required. Both the planning and the search algorithms will require domain specific heuristic enhancements to attain the performance gains indicated by Gaines (1996).

The system needs to implement a Knowledge Representation language, such as the one described by Blythe (1996), that will be able to accurately describe the dynamic environment, the effects of actions and the dynamic events.

The system must also allow all of the test scenarios to be represented and executed. As Cohen et al. (1989) mention, it is important that the scenarios are repeatable and reproducible. The system must gather the resulting information output from the execution of the test scenarios.

Lastly, whilst not essential, it will be useful for the system to be able to display a graphical representation of the current state of the environment as it executes.

4.4 Summary

This chapter has set out the requirements of the environment and the system that will allow the UCPOP algorithm to be evaluated.

5 Evaluation System Design

5.1 Introduction

This chapter describes the design of an evaluation system to meet the requirements detailed in the previous chapter. It describes functionality to integrate the planning algorithm, the search algorithms, and knowledge representation with the plan/execute/monitor cycle required for operation in a dynamic environment.
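The skeleton of such a cycle can be sketched as follows. This is an illustrative Python sketch only, not the evaluation system itself: the state and plan-step structures and the function names are all assumptions, although the plan-generation cost uses the Turn formula given later in section 5.3.3.

```python
# Illustrative sketch of a plan/execute/monitor cycle over simulated
# Turns. All names and structures here are assumptions, not the
# evaluation system's actual code.

def run_cycle(initial_state, goal_met, make_plan, max_turns=10_000):
    """Interleave planning, execution and monitoring until the Agent's
    goal holds in the state, or the Turn budget is exhausted."""
    state, turn, plan = dict(initial_state), 0, []
    while turn < max_turns:
        if goal_met(state):
            return turn                  # monitor: goal attained
        if not plan:
            plan = make_plan(state)      # plan generation is
            turn += 50 + 20 * len(plan)  # non-instantaneous (section 5.3.3)
            if not plan:
                continue                 # no valid plan was found
        effects, cost, succeeded = plan.pop(0)
        if succeeded(state):             # monitor non-determinate actions
            state.update(effects)        # execute: apply the action effects
        else:
            plan = []                    # action failed: force a re-plan
        turn += cost
    return None                          # Turn budget exhausted
```

A plan step here is simply (effects, cost-in-Turns, success-test); the success test models the monitoring of non-determinate actions, and a failed step forces a re-plan against the current state of the environment.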
In searching for a methodology to apply to the design stage, it became apparent that the KADS Knowledge Based Systems (KBS) approach, used in the Open University's Intelligent Systems course, had previously been applied to AI Planning problems. Punch et al. (1996) describe a real-world architecture called IPCA that integrates a KADS GTM (Generic Task Model) with real-time planning, execution and monitoring. Hawkins et al. (1996) provide further evidence for the IPCA use of GTMs in practical environments, although they do qualify that innovation is not a requirement of the domain. Against the use of KBS techniques, Aylett et al. (2000) question whether techniques such as KADS, or the updated CommonKADS, are able to handle the computational complexities associated with AI Planning problems. Having investigated the potential use of KBS methods, it is apparent that neither the KADS Planning GTM, as described by Tansley and Hayball (1993), nor the CommonKADS Synthesis GTM, as described by Schreiber et al. (2000), provides a good template for the design of the planning system. This is because, having chosen UCPOP as the algorithm to use, the knowledge based elements need to be built around the way in which UCPOP works rather than from the template set out in the GTMs. Adopting this approach drives the methodology from the wrong end and negates many of the benefits it provides. Therefore, although the design does not use a knowledge-based methodology, it does use some of the more generic techniques that such methodologies encompass, examples of which are the process models and the state transition diagrams. All of these have proven useful. It is also interesting to note that Schreiber et al. (2000) state that there is not much literature about the structured design and implementation of knowledge-based systems, with many often progressing straight from the high-level design model to application code.
The objective of the design stage remains the same - to design a system that successfully interlinks planning, execution and monitoring in a dynamic environment and allows data to be collected from the execution of the test scenarios.

5.2 Design of the Dynamic Environment

5.2.1 Representation of Time

Time is a crucial resource when planning in dynamic environments. It is a recurring theme throughout the AI Planning literature and is emphasised by Chapman (1987), Russell & Norvig (1995) and Allen (1984) amongst others. With robotic examples of AI Planning systems, such as Haigh and Veloso's (1998) ROGUE architecture, the planner must work in real-time to execute and monitor plans in a real-world environment. With simulations, this is not the case, as shown with Cohen et al.'s (1989) Phoenix system. Phoenix uses a simulation of real-time to overcome the constraints of its implementation platform, which runs tasks in serial rather than in parallel. This project will also use a simulation of real-time, assigning time costs in Turns. Turns are an abstract, yet uniform, unit of measurement that will be used to assign durations to tasks such as plan generation and action execution.

5.2.2 Environment Objects and Actions

The domain for the execution of the test scenarios will be made up of a grid of squares. Some of the locations will be accessible and other areas, representing water and walls, will not. The accessible squares will have differing time costs associated with traversing them to simulate differing terrain. Uphill terrain will take more Turns to traverse than level or downhill ground. Table 5-1 details the different types of terrain.

Accessible/Inaccessible | Traverse Cost (Turns) | Description
Accessible   | 100 | Level ground.
Accessible   | 150 | Uphill ground.
Accessible   | 50  | Downhill ground.
Inaccessible | NA  | Wall.
Inaccessible | NA  | Water.

Table 5-1 Environment Terrain Types

The grid will be kept constant at 10 x 10 squares to prevent the size of the environment having an impact on the information gathered from the test scenarios.

Figure 5-1 Environment Objects Class Diagram

There are a number of objects that exist within the domain to make it non-trivial. Figure 5-1 shows the object class diagram. Objects are divided into sub-classes of static objects, which remain in a fixed location, and those which are mobile. Each of the objects is described in Table 5-2 below:-

Object | Static/Mobile | Description | Attributes
AGENT  | Mobile | Makes plans to apply action effects to the environment to achieve goals. | X Position, Y Position, Got Key
BRIDGE | Static | Prevents access to the location that it occupies unless it is LOWERED. Changes in BRIDGE status represent the dynamic events in the environment. | X Position, Y Position, Status, Duration Counter
DOOR   | Static | Prevents access to the location it occupies unless it is unlocked and open. | X Position, Y Position, Door Status, Lock Status
KEY    | Mobile | Used by the AGENT to unlock a locked DOOR. | X Position, Y Position
START  | Static | AGENT start location. | X Position, Y Position
EXIT   | Static | AGENT goal location. | X Position, Y Position

Table 5-2 Environment Objects

The Agent object has a number of characteristics:-

It has goals that it endeavours to meet. For the test scenarios the constant goal of reaching the EXIT location will be used.

It is able to generate plans against the current state of the environment to apply actions to achieve goals.

It executes steps from a plan, applying the effects of non-instantaneous actions to the environment.
It monitors the results of attempting to apply non-determinate action effects, to check whether they have succeeded or failed.

There are a number of actions that can be applied to the environment to effect changes to its state. Some of the actions are determinate, the effects of which are guaranteed. Others are non-determinate, requiring the Agent to monitor the results to determine whether changes have been made. Actions are also non-instantaneous, taking multiple Turns to complete. This addresses the Classical AI Planning assumptions, highlighted by Weld (1994), McCarthy and Pollack (2000), and Chapman (1987) amongst others, of all action effects being determinate and the process of plan generation being instantaneous. Details of all of the actions that can be applied to the environment are given in Table 5-3.

Action | Determinate/Non-determinate | Time Cost (Turns) | Description
MOVETO | Non-determinate | Variable | Move a MOBILEOBJECT from one physical location to another.
PICKUP | Determinate | 150 | The ability for an AGENT to pick up and carry another MOBILEOBJECT.
UNLOCK | Determinate | 175 | An AGENT carrying a KEY and located next to a locked DOOR will be able to unlock it.
OPEN   | Determinate | 125 | An AGENT located next to a closed, unlocked DOOR will be able to open it.
RAISE  | Determinate | 0   | Change the status of a LOWERED BRIDGE to RAISED, thereby denying access to its location.
LOWER  | Determinate | 0   | Change the status of a RAISED BRIDGE to LOWERED, thereby allowing access.
WAIT   | Determinate | 375 | In the event that the AGENT is unable to construct a valid plan to achieve its goals, it will WAIT in anticipation that future dynamic events in the environment will allow it to progress its goals.
PLAN   | Determinate | Variable | Action to represent the non-instantaneous process of generating a plan.
Table 5-3 Actions that can be Applied to the Environment

Dynamic events have effects on the environment that are not caused by Agent actions. The raising and lowering of BRIDGEs cause the dynamic events within the environment. The status of each BRIDGE changes from RAISED to LOWERED, and vice versa, after a set duration of Turns. When a BRIDGE is RAISED, the Agent is unable to access the location it occupies and therefore cannot traverse the inaccessible areas of water.

One of the requirements of the dynamic environment is that it should be configurable using parameters. This is enabled by varying the initial configuration of the environment and the numbers of each type of object. The rate of dynamic change within the environment is also variable and can be changed by amending the Turn duration between BRIDGEs RAISING and LOWERING.

5.2.3 Formal Representation in PDDL

A Knowledge Representation mechanism is required to formally represent the objects and the actions that can be applied to the environment. The mechanism must be able to precisely represent the preconditions that must be true before an action can be applied. It must also be able to represent the effects that an action has on objects and the environment when it is applied. It must support universal quantification - the ability for an action effect to be applied to multiple objects - as will be the case when the Agent moves location carrying the KEY. Many approaches have been proposed and, generally, the more expressive the language, the greater the degradation in planner performance. Blythe (1996) describes a language for specifying planning problems where there are external events.
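By way of illustration, the behaviour that the representation must capture - preconditions, effects, and a universally quantified effect applying to every object the Agent is carrying - can be sketched procedurally as follows. This is an illustrative Python sketch with invented names and state structure, not the project's formal definitions, which are given in Figure 5-2.

```python
# Illustrative sketch of an action with a precondition, a direct effect
# and a universally quantified effect (everything carried moves with the
# Agent). Names and state layout are assumptions for illustration only.

def moveto(state, agent, dest):
    """Apply MOVETO if its precondition holds; return the new state,
    or None if the precondition fails."""
    # Precondition: the destination location must be accessible.
    if dest not in state["accessible"]:
        return None
    new_state = {**state, "at": dict(state["at"])}
    # Direct effect: the Agent's location changes.
    new_state["at"][agent] = dest
    # Universally quantified effect: every object carried by the
    # Agent moves with it.
    for obj, carrier in state["carried_by"].items():
        if carrier == agent:
            new_state["at"][obj] = dest
    return new_state

state = {
    "accessible": {(1, 1), (1, 2)},
    "at": {"AGENT": (1, 1), "KEY": (1, 1)},
    "carried_by": {"KEY": "AGENT"},
}
after = moveto(state, "AGENT", (1, 2))
print(after["at"])   # the KEY moves with the AGENT
```

The loop over carried objects corresponds to a universally quantified, conditional effect: for all mobile objects, when carried by the Agent, their location changes with the Agent's.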
Penberthy & Weld (1992) chose a subset of Pednault's ADL (Action Description Language) for UCPOP, stating their frustration at the restrictiveness of the STRIPS representation and their apprehension at the prospect of implementing full situation calculus. But even though UCPOP has been chosen as the planning algorithm, ADL will not be the Knowledge Representation mechanism. PDDL (Planning Domain Definition Language) was an attempt to define a common domain and problem definition language for the AIPS 98 (Artificial Intelligence Planning Systems) planning competition. It is derived from a number of sources, in particular the UCPOP/ADL formalisation. McDermott et al. (1998) describe the syntax of the language in the PDDL Reference Manual. PDDL continues to be developed today, with both PDDL+ and PDDL 2.1 in production. PDDL is therefore the chosen Knowledge Representation language. The actions and objects described in the previous section are shown in PDDL form in Figure 5-2.

Figure 5-2 PDDL Representation of Environment Objects and Actions

5.3 Design of the Evaluation System

As previously mentioned, Gaines (1996) insists that heuristic enhancements are essential to designing efficient planners for complex domains. Gaines (1996) also says that the goal of practical planners is to achieve a compromise between efficiency and soundness through the use of heuristics and causal domain knowledge. The following section sets out the design of a practical planner for the dynamic environment described above.

Figure 5-3 Process Model for Plan/Execute/Monitor Cycle

5.3.1 Evaluation System Structure

The evaluation system implements the plan/execute/monitor cycle. This provides a framework into which all of the other required functions are added. Figure 5-3 shows the process model for the plan/execute/monitor cycle.
In line with the use of the process model in the KADS methodology - Tansley & Hayball (1993) - the processes and information stores have been denoted as either system or knowledge based. The knowledge-based elements deal with the generation of a plan, the selection of the next action to apply, and the use of the search algorithms. The knowledge-based areas of the system are described in sections 5.3.2 and 5.3.3. The plan generation section of the cycle is handled by process 7.PLAN. Execution of the plan is covered by processes 5.GET-CURRENT-AGENT-STEP, 6.GET-NEXT-AGENT-STEP, and 8.EXECUTE-STEP. The monitoring for action failure and goal attainment is covered by the ACTION-FAILURE part of process 8 and by 10.CHECK-AGENT-GOALS. The dynamic events within the environment are handled by process 4.DYNAMIC-ENVIRONMENT-EVENTS. This process checks whether the status of any of the BRIDGEs should be changed from LOWERED to RAISED, and vice versa. Pseudo code for the dynamic events is shown in Figure 5-4.

Figure 5-4 Pseudo Code for Dynamic Events

The process model also shows the ongoing collection of metrics for later analysis into the D4 LOG-FILE. Information is collected from both the plan generation and the plan execution parts of the system. A more dynamic view of the execution of the application is given in Figure 5-5, which shows the Agent State Transition Diagram. It again covers the plan/execute/monitor cycle, showing the states that the Agent goes through and the external events that influence the transition between states. The plan generation phase is covered by states 1.BUILDING-A-PLAN and 2.SELECTING-APPROPRIATE-ACTION. Execution is covered by states 3.SELECTING-NEXT-STEP and 5.EXECUTING-PLAN-STEP.
States 4.CHECKING-STEP-IS-EXECUTABLE and 6.CHECKING-FOR-GOAL-ATTAINMENT cover the monitoring phase.

Figure 5-5 Agent State Transition Diagram

As well as the functions to handle the dynamic events, the plan/execute/monitor cycle and the metric capture, some additional functions were needed. The Playstation implementation platform requires initialisation and shutdown processes, both at a system level and when drawing each frame of the display. GUI display functions, and a function to initialise each particular test scenario, also need to be included.

5.3.2 Design of the Search Algorithm

Whilst AI Planning is the focus of this project, I have already mentioned the interdependence between planning, search and knowledge representation. Both Chapman (1987) and Weld (1994) highlight the natural formulation of planning problems as searches, and Weld (1999) also emphasises the importance of search algorithms to performance. So, whilst search is not the main focus, it is a key component. The project has three requirements of the search algorithms, to:-

Generate a search tree of potential routes between two physical locations.

Select the best route between two locations from the search tree, given selection criteria.

Identify any further preconditions of the Agent using the selected route, such as the need to open a DOOR.

Russell and Norvig (1995) provide comprehensive coverage of search algorithms, and the algorithm derived for the project is based on the templates they provide. The search process has two phases. The first phase expands the search tree from an initial node. The second selects a route and identifies any additional goal preconditions that must be met. Figure 5-6 shows a section of an expanded search tree. Each of the nodes represents a physical location within the environment.
From each of the nodes, branches are shown to all adjacent, accessible locations. The average number of branches from each node - the branching factor - is an important contributor to the eventual size of the search tree. The search tree tends to get very large, very quickly. Whilst there are ways to try to limit the branching, Russell and Norvig (1995) note that some problems are just plain difficult and will always result in large search trees. Terminal nodes are nodes that cannot be expanded any further, and Goal nodes represent a successful path from the initial node to the Agent's goal location. A cost is associated with each path to allow a decision to be made about which is the more desirable route.

Figure 5-6 Search Tree

Russell and Norvig (1995) describe a number of ways in which the search tree can be expanded from the initial node. They cover the simplest brute force approaches through to informed searches that use domain specific knowledge to improve efficiency. As Ginsberg (1993) states, it is infeasible to expand the whole of the search tree, and this project will therefore adopt a best-first, depth-limited approach. A number of domain specific heuristic enhancements have been added to increase performance. As Gaines (1996) suggests, the only way to achieve reasonable performance is to introduce domain specific knowledge to limit the expansion of unpromising nodes. The heuristic enhancements are:-

As the memory available to hold the search tree on the implementation platform is severely constrained, the search tree will be limited to a finite number of nodes. Once the maximum number of nodes is reached, expansion of the search tree ceases. This relies on the best-first approach finding a solution before the available number of nodes is exhausted.

The depth to which the search is performed is also limited. If expansion reaches the depth limit without finding a terminal node, it stops expanding the current branch and backtracks through parent nodes until it finds an unexpanded branch from which it can progress. This is shown at Node 9 in Figure 5-6.

Expansion of a branch will also cease if the cost to the current point is greater than the cost from the initial node to the cheapest goal node. This is shown at Node 5, as Node 8 provides a cheaper route to the goal.

For any node, expansion of the accessible child nodes will be ordered so that the nodes that take the Agent closer to its goal location are expanded first. This represents the 'best first' approach.

Figure 5-7 Search Tree Pseudo Code

Hasemer and Domingue (1989) highlight two components for implementing knowledge-based processes - procedural knowledge and declarative knowledge. The procedural knowledge for the search algorithms has been defined in pseudo code. Figure 5-7 shows the Search Tree Expansion process. Pseudo code for the selection of the best route and the identification of further preconditions is contained in Appendix B. Implementing the search algorithms requires a structure to hold information about the nodes in the search tree and a structure to hold information about the goal nodes - the declarative knowledge. Details of the attributes of those structures are shown in the tables below:-

Attribute | Description
PARENT-NODE [ARRAY] | Array of pointers to parent nodes.
GOAL-NODE [ARRAY] | Array of Goal node flags.
X-POS [ARRAY] | Array of physical X positions.
Y-POS [ARRAY] | Array of physical Y positions.
ACCESSIBLE-DIRECTIONS [ARRAY][ARRAY] | Array of accessible directions from the current node.
PATH-COST [ARRAY] | Array of path costs to the current node.
NODE-COST [ARRAY] | Array of current node costs.
DEPTH [ARRAY] | Array of search depths to the current node.
Table 5-4 Search Tree Node Structure

Attribute | Description
LOWEST-COST-NODE | Pointer to the Node in the array with the lowest cost path.
LOWEST-COST-VALUE | The lowest cost path currently found to the goal.
GOAL-NODE [ARRAY] | Array of pointers to the Nodes that are Goal Nodes.
GOAL-NODE-COST [ARRAY] | Array of path costs associated with the Goal Nodes.

Table 5-5 Goal Node Structure

5.3.3 Design of the Planning Algorithm

As mentioned at the start of this chapter, the design of the application has been driven from the UCPOP algorithm, shown in Figure 5-8. Figure 5-8 effectively represents the pseudo code for the planner. The objective of this section is to again adopt the Hasemer and Domingue (1989) approach of breaking the knowledge-based process of AI Planning into procedural functions and declarative data structures.

Figure 5-8 UCPOP Algorithm, Reproduced with Permission

A UCPOP plan is made up of the following components:-

A set of Steps that have preconditions that must be met before they can be applied, and effects on the environment when they are executed.

A set of Binding Constraints on free variables associated with the set of Steps.

A set of Orderings that constrain some Steps to be applied before others.

A set of Causal Links that indicate which Step effects meet which goal preconditions.

The UCPOP algorithm also takes two further inputs:-

A set of Goals that have not yet been met by the plan being constructed.

The set of action schemata that can be applied within the environment.

The logical Plan Class Diagram, Figure 5-9, shows the classes, attributes and relationships that represent the components of, and inputs to, the UCPOP algorithm.
This forms the basis of the data structures that will be implemented in the evaluation system to represent a plan.

Figure 5-9 Logical Plan Class Diagram

The following table gives more details of the links between the components of the UCPOP algorithm and the Plan Class Diagram.

UCPOP Component | Plan Class Entities
Set of Steps | STEP + EFFECT + PRECONDITION
Set of Binding Constraints | BINDING
Set of Orderings | ORDERING
Set of Causal Links | CAUSAL LINKS
Outstanding Goals | GOAL + PRECONDITION

Table 5-6 Mapping UCPOP to the Logical Plan Class Diagram

Central to the procedural requirement of the UCPOP algorithm is the MGU theorem prover. MGU provides the means to identify step effects that match goal preconditions, i.e. the actions to apply to meet goals. The evaluation system will implement the MGU function in the following way:-

It will implement a set of action schemata with defined action effects.
It will implement a set of preconditions covering each type of goal.
Both structures will use a common structure to represent effects and preconditions.
A function will be implemented to compare each of the action effects to see whether they meet the preconditions of the outstanding goal being processed.

This addresses Step #3 of the UCPOP algorithm, Operator Selection (see Figure 5-8). Step #4 of the UCPOP algorithm covers Sub-goal Generation and deals with any preconditions that may exist for the operator selected in Step #3. For the domain that I have created, the goal dependencies in Table 5-7 apply; they are added to the outstanding goal list when an operator is applied and its conditions are met. The new goals are addressed in subsequent recursive calls to the UCPOP algorithm instigated by Step #6, Recursive Invocation.
Operator | Condition | Dependent Preconditions
MOVETO | Closed DOOR on route | OPEN
UNLOCK | None | MOVETO
UNLOCK | AGENT is not carrying the KEY | PICKUP
PICKUP | None | MOVETO
OPEN | None | MOVETO
OPEN | DOOR is locked | UNLOCK
None | No valid plan | WAIT
All Operators | None | PLAN

Table 5-7 Operator Dependencies

Causal links are also added to the plan when an operator is selected. The causal link associates the operator effects with the goal preconditions that they meet. Threat resolution, Step #5 of the algorithm, is handled when new sub-goals are added. The actions and objects that exist within the environment mean that the only threat that has to be handled is the duplication of sub-goals. The evaluation system also needs to deal with the situation whereby there is no valid plan to achieve the Agent's goals. In this case, the steps so far added to the plan are replaced by a single WAIT action. Finally, to make the plan generation process non-instantaneous in the simulation of real time, the planning algorithm needs to do two things. Firstly, a PLAN action is added to the plan and ordered to be the first action executed. Secondly, a duration in Turns is given to the PLAN action using the following calculation:-

Plan Duration = 50 + (20 * Number-Steps-In-Plan) Turns

The calculation models the fact that a plan with a greater number of steps will take longer to generate.

Summary

This chapter has set out the methods that have been applied to the design process. It has covered the representation of time and the definition of environment objects and actions in PDDL. It has described the capabilities of the Agent and the way in which the evaluation system will operate. It has also described the way in which dynamic events will be triggered and the heuristic enhancements made to the search routines.
Finally, it has covered the procedural and declarative knowledge required for both the search and planning algorithms.

Definition of the Test Scenarios

Introduction

The objective of this chapter is to identify the parameters that will be varied during the tests and to construct specific test scenarios from the constituent objects of the dynamic environment. It also identifies the dependent variable information that will be gathered for later analysis.

Parameters

The test scenarios will vary three parameters:-

The first parameter is the complexity of the environment. Complexity is a combination of the number of accessible and inaccessible locations in the environment, the number of objects that generate dynamic events - BRIDGEs - and the number of other objects such as DOORs and KEYs. Increased complexity should result in longer Agent plans and greater demands on the search algorithms. McDermott and Hendler (1995) state that plans get difficult very quickly, and I should be able to draw some assessment from varying this parameter. Three different levels of complexity will be used.

The second parameter is the rate of dynamic change in the environment. This is represented by the frequency with which BRIDGEs raise and lower. Increasing the frequency should increase the likelihood that the dynamic environment will have changed between Agent plan generation and Agent action execution. Three different frequencies will be used.

The third parameter is the location at which the Agent begins the test scenario. Three different locations will be used. This will test the effect of time interactions between Agent plans and the events within the dynamic environment.

Collection of Metric Information

Tables 6-1 and 6-2 below detail the metric information that will be gathered from the execution of the test scenarios.
It covers information that will assess the complexity of the environment, the complexity of the plans generated, the complexity of the searches required and the success rate for the application of plans to the dynamic environment.

Reference | Description
C-SIZE | A record of the size of the dynamic environment. This will be kept constant at 10x10 for the test scenarios.
C-ACCESSIBLE | The number of locations that are accessible to the AGENT.
C-INACCESSIBLE | The number of locations that are inaccessible to the AGENT.
C-DYNAMIC-OBJECTS | The number of dynamic objects - BRIDGEs - in the environment.
C-OTHER-OBJECTS | The number of other objects - DOORs, KEYs - in the environment.
C-TOTAL-OBJECTS | The total number of objects.
C-CHANGE | An indication of the rate of dynamic change in the environment.
C-COMPLEXITY | An indication of the complexity of the environment.

Table 6-1 Metrics to be Gathered About the Environment

Reference | Description
P-PLAN-NUMBER | Number of the current plan.
P-NUM-PRECONDITIONS | Number of preconditions in the current plan.
P-NUM-ACTION-EFFECTS | Number of action effects in the current plan.
P-NUM-CAUSAL-LINKS | Number of causal links in the current plan.
P-NUM-ORDERINGS | Number of orderings in the current plan.
P-TOT-PLAN-NUMBER | Total number of plans that have been generated.
P-TOT-NUM-PRECONDITIONS | Total number of preconditions in all plans.
P-TOT-NUM-ACTION-EFFECTS | Total number of action effects in all plans.
P-TOT-NUM-CAUSAL-LINKS | Total number of causal links in all plans.
P-TOT-NUM-ORDERINGS | Total number of orderings in all plans.
E-PRECONDITIONS-MET | Total number of preconditions that have been successfully met during execution.
E-PRECONDITIONS-FAILED | Total number of preconditions that have failed during execution.
E-NUM-SEARCHES | Total number of searches conducted.
E-NUM-SEARCH-NODES | Total number of nodes generated during searches.
E-NUM-GOAL-NODES | Total number of nodes that are goal nodes.

Table 6-2 Metrics Gathered During Test Scenario Execution

Test Scenarios

27 test scenarios will be executed, covering three levels of complexity, against three levels of dynamic change, starting from three different locations.

Environment Complexity

Table 6-3 shows the weighted values assigned to each of the components of the complexity rating. The resulting graph suggests a linear progression.

Table 6-3 Scenario Complexity Calculation and Weightings

Figure 6-1 Scenario Complexity Graph

Rate of Dynamic Change

For the rate of dynamic change, a simple multiplier was applied to the durations between the dynamic events - the changing of BRIDGE status between RAISED and LOWERED. Rates at levels 2 and 3 therefore had durations two and three times longer respectively.

Agent Start Location

For the final parameter, the Agent starting location was varied. None of the starting locations, in any of the scenarios, increased the number of obstacles between the Agent and its goal.

Constants

A number of factors were kept constant across all of the test scenarios in an attempt to reduce experimental noise in the data collected. The constant factors were:-

The Agent retained the same goal throughout: to endeavour to reach the EXIT.
The EXIT remained in the same location.
The overall size of the environment remained constant at 10 x 10 squares.
The set of actions that could be applied to the environment remained constant.
The same planning and search algorithms were used for every scenario.

Detailed information, showing the starting values for all of the parameters in each scenario, is contained in Appendix D.
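The rate-of-change multiplier described above can be sketched in C. This is a minimal illustration only: the base duration of 100 Turns and the function name are hypothetical placeholders, the actual per-scenario durations being those listed in Appendix D.

```c
#include <assert.h>

/* Hypothetical base duration, in Turns, between BRIDGE raise/lower
   events at rate level 1; actual values are given in Appendix D. */
#define BASE_EVENT_DURATION 100

/* Levels 2 and 3 simply multiply the base duration by 2 and 3,
   giving progressively slower rates of dynamic change. */
int EventDuration(int rateLevel)
{
    return BASE_EVENT_DURATION * rateLevel;
}
```

A longer duration between events means a lower rate of dynamic change, so level 1 represents the fastest-changing environment.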
Summary

This chapter has identified environment complexity, rate of dynamic change and Agent start location as the parameters that will be varied. It has also detailed the data that will be captured for analysis from the test scenario execution.

Implementation of the Evaluation System

Introduction

The objective of this chapter is to build an evaluation system that will allow data to be gathered from the execution of the test scenarios against the implementation of UCPOP. The approach taken was to map the design elements to C functions and structures. The functions and structures were then combined with Sony's proprietary framework code to allow the application to run on the Playstation.

Implementation Platform

The prototype application was implemented on a Sony Net Yaroze Playstation. The Playstation provides real-world, limited-resource constraints on the planning algorithm in the form of the processing power available from the customised 33 MHz MIPS R3000 CPU and the 0.5 MB of memory available to bespoke code and data.

Figure 7-1 Application Development Process

Figure 7-1 shows the development process used to create the application. The application code was written in C using the Metrowerks CodeWarrior IDE. Source code was compiled and linked with the proprietary Playstation Runtime Library to create an executable file. The Runtime Library provides functionality to allow communication with Playstation hardware such as the video device. Graphical images used to construct the display were created in JASC Paintshop Pro and then converted into the proprietary Sony TIM format using Sony's TIM Utility.

Figure 7-2 Application Execution on the Sony Playstation

Figure 7-2 shows the process of executing the application. Execution of the compiled code is initiated from the PC across a serial interface, using Sony's DOS-based SIOCONS application.
SIOCONS provides the mechanism to load and execute the application on the Playstation. It also allows output from the Playstation to be received back on the PC. This is the mechanism by which the metric information from the execution of the test scenarios was gathered.

Evaluation System

Drawing on the evaluation system design presented earlier, the functional breakdown structure shown in Figure 7-3 was constructed. The structure has three main areas: initialisation of the hardware and test environment, the main loop that covers the plan/execute/monitor cycle, and the system close down. Each of the features has been implemented as C structures and functions.

Figure 7-3 Functional Structure of the Application Prototype

Implementation design details for all of the functions and structures are given in Appendix C. The application code is included on the accompanying diskette.

Figure 7-4 Source File Structure

Figure 7-4 shows the source files into which the functions are grouped. The source file _psstart.c provides the mechanism to start the bespoke application once it has been loaded onto the Playstation. NY_Prefix.h and Pad.h are Sony header files that contain presets for controlling the Playstation components and peripherals. LIBPS.A.lib is the proprietary Sony framework that provides functions to control the hardware. The rest of the source files and headers were created for the project. A function common to all sections of the system is the output of data from the test scenarios for later analysis. Information is output via the C function printf. Rather than display information on-screen, the compiler inserts code that directs the output back to the PC via the serial port. The following sections describe the function of each of the main areas of the system.

Initialisation

The initialisation part of the system sets the configuration of the hardware and the dynamic environment.
Information about the configuration of the specific test scenario being executed is output to a log using printf.

Main Loop

The main loop handles the plan/execute/monitor cycle. It initiates the dynamic events in the environment and handles both the creation of the GUI representation of the environment and the technical aspects of the Playstation display. The main loop also handles the implementation of the representation of time: each time the main loop starts, the Turn counter is incremented by one. Details of the implementation of the UCPOP planner and the search algorithms are given later in this chapter.

Close Down

The close down section of the application outputs the final set of metrics before performing hardware shutdown tasks.

Dynamic Environment

Environment Objects

Each of the objects in the dynamic environment - AGENT, KEY, EXIT, START, DOOR and BRIDGE - was implemented as a C structure. The implementation was derived directly from the attributes listed in Table 5-2. The definitions are shown in Figure 7-5.

Figure 7-5 C Structures for Environment Objects

Dynamic Events

The dynamic events were implemented in the function DynamicEvents. The function applies either the RAISE or LOWER action to a BRIDGE when its duration timer reaches zero. Application of these actions is determinate and instantaneous, but this is not an issue for the evaluation system, as the location accessibility a BRIDGE controls is only affected when the action completes. The DynamicEvents function is called as part of the main loop.

Goal Preconditions and Action Effects

The dynamic environment needed a mechanism for representing the preconditions that need to be true before a goal can be met.
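A minimal sketch of such a representation is given below. The structure names are those used by the evaluation system, but the field names and types here are assumptions for illustration; the actual definitions are those shown in Figure 7-6. Three parameter/value pairs are enough to cover the widest precondition (AT, which carries both X and Y positions).

```c
/* Sketch of the common condition representation (hypothetical
   field names; see Figure 7-6 for the actual definitions). */
typedef struct {
    int parameter[3];   /* constant parameter identifiers         */
    int value[3];       /* state-dependent values, implementing   */
                        /* UCPOP's Binding Constraints            */
} Condition_Struct;

/* A goal precondition that must hold before a goal can be met. */
typedef struct {
    int              goalType;   /* e.g. AT, DOOR-OPEN, GETKEY */
    Condition_Struct condition;
} Precondition_Struct;

/* The effect an action has on the environment when applied. */
typedef struct {
    int              actionType; /* e.g. MOVETO, OPEN, PICKUP  */
    Condition_Struct condition;
} Effects_Struct;
```

Sharing one Condition_Struct between preconditions and effects is what allows a single comparison function to serve as the MGU equivalent.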
Figure 7-6 shows the C structures Precondition_Struct and Effects_Struct that hold the arrays of goal preconditions and action effects applicable to the environment.

Figure 7-6 C Structures for Goal Preconditions and Action Effects

Both structures use a third, common structure - Condition_Struct - which is used by the equivalent of UCPOP's MGU function to find an action with effects that meet a goal's preconditions. The preconditions and effects arrays are populated by the function PreconditionEffectInit at the start of execution. Table 7-1 shows the common settings of the Condition structure that indicate which Effect can be applied to meet which Precondition. Parameters #1 - #3 are constants; Values #1 - #3 are variables that depend on the state of the environment when the precondition or effect is generated. Values #1 - #3 are used to implement UCPOP's Binding Constraints on variables.

Precondition | Effect | Parameter #1 | Value #1 | Parameter #2 | Value #2 | Parameter #3 | Value #3
AT | MOVETO | AGENT | Null | X Position | Destination X Position | Y Position | Destination Y Position
DOOR-OPEN | OPEN | DOOR | DOOR Index Number | Open Status | Open Status | Null | Null
DOOR-UNLOCKED | UNLOCK | DOOR | DOOR Index Number | Lock Status | Lock Status | Null | Null
GETKEY | PICKUP | KEY | Null | KEY Status | Carried Status | Null | Null

Table 7-1 Common Condition Settings for Preconditions and Effects

Planning Algorithm

Figure 7-7 shows a more detailed breakdown of the structure of the process to build a plan. The main components of the plan building functions are described below.

Figure 7-7 Structure of Plan Creation Functions

The functional structure of the plan creation process is built around the implementation of the UCPOP plan data structure.
Taking the Plan Class diagram from the design stage - Figure 5-9 - the Entity Relationship Diagram shown in Figure 7-8 was constructed.

Figure 7-8 Plan Entity Relationship Diagram

Two changes were made from the logical class diagram for the physical implementation. The UnmetPreconditions structure combines the GOAL, PRECONDITION and BINDING classes. The ActionEffect structure combines the STEP, PRECONDITION and BINDING classes. C structures, shown below, were implemented from the Entity diagram and the attributes of the Plan Class diagram.

Figure 7-9 Plan Data Structures

UCPOP MGU Implementation

MGU is the UCPOP function that finds actions to meet goal preconditions. It is implemented in the evaluation system by the FindEffect function. The FindEffect function loops until all UnmetPreconditions have been addressed or plan generation fails. ActionEffects, CausalLinks and Orderings are added to the partial plan to meet each UnmetPrecondition. Matching involves comparing the parameters and values in the Condition structure of the Effect to the parameters and values in the Condition structure of the Precondition. It may also include the use of the search algorithm to select a physical route between two locations. The function GetRoutePreconditions adds UnmetPrecondition entries into the partial plan for each precondition the Agent will need to meet to allow it to traverse the selected route. The implementation of the search routines is covered later in the chapter. Given the time constraints on the implementation of the evaluation system, the operator dependencies shown in Table 5-7 have been hard-coded. The operator dependencies generate additional UnmetPreconditions that subsequent iterations of the FindEffect loop will attempt to find actions for.
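The matching at the heart of FindEffect can be illustrated with a self-contained sketch. The structure and function names below are assumptions for illustration, not the project's actual code, but the comparison logic follows Table 7-1: an effect is able to meet a precondition when its constant parameters match.

```c
#include <assert.h>

/* Hypothetical parameter identifiers, for illustration only. */
enum { P_NULL = 0, P_AGENT, P_XPOS, P_YPOS, P_DOOR, P_KEY };

/* Minimal stand-ins for the common Condition structure and the
   action-effect entries (see Figure 7-9 for the actual forms). */
typedef struct { int parameter[3]; int value[3]; } Condition;
typedef struct { int actionType; Condition cond; } ActionEffect;

/* An effect can meet a precondition when all three constant
   parameters match; the values are bindings filled in from the
   environment state and are not compared here. */
static int ConditionsMatch(const Condition *pre, const Condition *eff)
{
    int i;
    for (i = 0; i < 3; i++)
        if (pre->parameter[i] != eff->parameter[i])
            return 0;
    return 1;
}

/* Scan the action-effect table for the first effect able to meet
   the given precondition; -1 signals that no action applies. */
int FindMatchingEffect(const Condition *pre,
                       const ActionEffect *effects, int numEffects)
{
    int i;
    for (i = 0; i < numEffects; i++)
        if (ConditionsMatch(pre, &effects[i].cond))
            return i;
    return -1;
}
```

A first-match linear scan mirrors the iterative style of the FindEffect loop; a return of -1 corresponds to the no-valid-plan case that the planner handles with a WAIT action.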
Should the planner determine that there is no valid plan to achieve the Agent's goal, it replaces the partially constructed plan with a single WAIT action. The final phase of constructing a plan is to add a PLAN action. PLAN is ordered so that it is the first action to be executed and is used to show that the process of plan generation is non-instantaneous.

Figure 7-10 Example Plan

Figure 7-10 shows an example of a partial order plan constructed by the planner. UnmetPreconditions can only be addressed during execution once all of their dependent UnmetPreconditions have been met. In the example, once 9.Plan-Generated has been met, the planner can choose to address either 8.Agent-At-Key or 6.Agent-Adjacent-Door. The function that chooses the next action to apply is covered later in section 7.7.2.

Search Algorithm

Figure 7-11 shows a more detailed breakdown of the search algorithm functions. It is derived from the pseudo code presented in the design section.

Figure 7-11 Search Functions Structure

Drawing on the declarative knowledge requirement of the design stage, the following Entity Relationship Diagram was constructed.

Figure 7-12 Search Entity Relationship Diagram

This, in turn, was converted into the C data structures shown in Figure 7-13 by adding attribute information from Tables 5-4 and 5-5.

Figure 7-13 Search C Structures

Generation of the Search Tree

The function GenerateSearchTree was implemented from the design pseudo code. It takes two sets of X and Y coordinates and creates a search tree of potential routes between the two. Each node added to the search tree represents a physical X,Y location in the environment. Checks are performed to ensure that the location is accessible - e.g. that it isn't WALL or WATER - and that it hasn't been processed before. Heuristics, added to limit the expansion of unpromising nodes, include:-

Ordering the nodes for expansion so that the Agent moves closer to its goal first.
Limiting the depth to which a path is searched to 32 nodes.
Checking that the partially expanded path is not more costly than the cheapest goal route already found.

As mentioned earlier, available memory is extremely constrained on the implementation platform, and the total number of nodes in the search tree is therefore limited to 2,000.

Route Selection

The SelectRoute function selects the physical route with the cheapest Turn cost from the search tree. The routine then works backwards from the goal node, via parent node references, copying the sequence of X,Y locations into the Agent Route structure.

Identification of Route Preconditions

The GetRoutePreconditions function is used during planning to identify any preconditions that need to be met before the Agent can traverse the physical route selected. It is implemented from the pseudo code shown in Appendix B.

Plan Execution and Monitoring

Figure 7-14 shows a more detailed breakdown of the structure of the plan execution and monitoring functions of the system.

Figure 7-14 Structure of Execute and Monitor Functions

Execution and monitoring also use the Plan_Struct structure shown below to track the status of actions being executed and the overall plan status.

Figure 7-15 Plan Data Structure

Execute Plan

ExecutePlan deals with the application of non-instantaneous, non-determinate action effects to the environment. It performs a number of execution and monitoring tasks. It initially checks whether the Agent has a valid plan; if it hasn't, it calls the BuildPlan function to create one.
The routine then checks whether the Agent has a current action to progress. If it hasn't, it calls the GetNextAction function to select one. Once the Agent has a valid plan and action, the ExecutePlan routine progresses the non-instantaneous action by decrementing its duration counter. If the counter reaches zero, the effects of the action are applied to the environment. ExecutePlan includes code to monitor the application of action effects to see whether or not they were successful. For successfully applied actions, the routine updates the current status of the Agent's plan: the CausalLinks, ActionEffects, UnmetPreconditions and Orderings associated with the successful action are all flagged as complete. Action failures invalidate the Agent's plan and cause a new plan to be generated on the next iteration of the plan/execute/monitor cycle.

Select the Next Action

Once an action has been completed, the planner must choose the next action to apply from the plan. With a partially ordered plan there may be several candidate actions that can be progressed. GetNextAction identifies the set of available actions and chooses the one with the shortest duration to apply next. The selection of the cheapest action is based on the heuristic assumption that the shorter the duration, the less likely it is that the dynamic environment will have changed, and therefore the greater the chance of successfully applying the action effects.

Test Scenarios

All of the parameters defining each particular scenario were set during the initialisation phase. Appendix D shows the environment configuration, the numbers and attribute settings for each object, and the rate of change assigned to BRIDGEs for each scenario.

Summary

This chapter has described the constraints of the implementation platform and the method of translating the design components into C functions and structures.
It has covered the implementation of the UCPOP planning algorithm, the search algorithms and other major elements, such as the dynamic events.

Results of the Execution of the Test Scenarios

Introduction

This chapter describes the methods used to conduct the tests and the results that were obtained.

Method of Execution

Experimentation was the method used to conduct the tests. Values were set for the independent variables, the test was executed, and measurements of the dependent variables were taken. The method of execution for each of the 27 test scenarios was as follows:-

Values were set in the application source code for each of the independent variables - the environment configuration, the rate of dynamic change, and the Agent starting location.
The application source code was compiled and linked into executable code on the PC.
The executable code was downloaded to and executed on the Playstation.
Data output from the application during execution was captured to text files on the PC via the serial link.
The text file was converted and loaded into Microsoft Excel, where it was amalgamated with the results from the other test scenarios.

Once all of the individual test results had been gathered, the figures were averaged by the independent variables being manipulated - complexity, rate of change and start location.

Results Obtained

The averaged results obtained from the tests are included in Appendix E. Full results and the output from each of the 27 scenarios are included on the accompanying diskette.

Summary

This chapter has explained the method used to execute the tests and stated where the results can be found.

Discussion of Test Results

This chapter presents a quantitative analysis of the planner's performance against the test scenarios, a generalisation of the results captured, and comments on experimental noise.
Analysis of the Test Results

Mean values were calculated from the results of the 27 test scenarios for each of the variable factors - the environment complexity, the rate of dynamic change and the Agent start location. Analysis of each is presented below, but it is first necessary to highlight what appears to be an anomaly in the data. One of the scenarios processes over 52,000 search nodes to reach the Agent goal, substantially more than any other scenario and far above the mean figure.

Figure 9-1 Number of Search Nodes Examined by Scenario

This is probably caused by the interaction of the time taken for the Agent to apply actions and the changing state of the dynamic environment. The data from this scenario has not been excluded from the results and does have some impact on the mean figures.

Environment Complexity

The data sample is very small and would need to be increased to provide conclusive proof.

Figure 9-2 Plan, Preconditions, Action Effects and Orderings by Complexity

The effect of the complexity rating for each of the three environments on the results also needs to be considered. The graph plotted earlier, in Figure 6-1, shows a linear progression between configurations, but again, the data sample is too small to be conclusive. In summary, there is a direct relationship, which subjectively appears to be exponential, assuming that the relationship between the complexity of the environments is linear. There is also a positive correlation between increasing complexity and the number of plan preconditions generated, met and un-attempted (Figure 9-3). The number of un-attempted preconditions - preconditions that are generated as part of the plan but never addressed during execution - reaches 69% in the most complex scenario.
This seems to indicate that making long-term plans in very complex environments is wasteful of processor time and that perhaps a shorter, more reactive approach would be more applicable. McDermott and Hendler's (1995) point about coupling traditional planners and reactive planners may well have an application in this type of scenario.

Figure 9-3 Preconditions Met, Failed and Un-attempted by Complexity

There is a negative correlation between the number of searches performed and the total number of search nodes processed as complexity increases (Figure 9-4).

Figure 9-4 Searches Performed and Search Nodes by Complexity

This seems to indicate that whilst the number of searches performed increases in more complex environments, the searches are more localised and the total number of nodes processed falls.

Rate of Dynamic Change

Figure 9-5 Mean Results for Dynamic Rate of Change

The mean figures for the rate of dynamic change show parabolic curve formations. The highest values are associated with the highest rate of change, falling to the lowest values for the medium rate of change, before rising again at the lowest rate of change. Further investigation with additional data is required to draw definitive conclusions about the relationship.

Agent Starting Location

Results from start locations #2 and #3 are roughly comparable, but values from location #1 are noticeably lower. Again, I believe this is due to the interaction of the timing between Agent action application and the changing state of the dynamic environment.

Figure 9-6 Mean Results by Agent Start Location

Generalisation

Hanks et al. (1993) suggest that we must generalise results from simplified, simulated environments to make them more applicable to complex environments.
However, the weight of research evidence suggests that the only way to achieve acceptable performance in complex environments is to add domain-specific heuristic enhancements - Chapman (1987), Penberthy and Weld (1992), Gaines (1996). Generalisations made must therefore consider the effects of the heuristic specialisations on the results obtained. The results show the importance of the search algorithms, both from the number of searches conducted and the number of search nodes processed. Even the simplest test scenarios process many thousands of nodes to achieve the Agent goals. The search algorithm implemented for the evaluation system is quite crude and basic in its approach. As McDermott and Hendler (1995) suggest, search remains crucial; efficiencies in the search algorithms would therefore increase the efficiency of the planner. As complexity increases, the number of preconditions in a plan that are not even addressed before execution failure rises steeply. Having consumed resources to generate a plan, the planner should endeavour to progress as much of the plan as possible. The current implementation does not do this. Efficiency could therefore be increased through the addition of failure identification and recovery functions, like those found in Haigh and Veloso's (1998) ROGUE system and McCarthy and Pollack's (2000) Rationale-Based Monitoring. Efficiency of the planner could potentially also be increased through the introduction of beliefs about the future state of the environment. Both Blythe (1996) and Allen (1984) suggest that an Agent should be able to reason advantageously about alternative future states. As observed with the anomalous data mentioned above, the interaction of the time taken to execute Agent actions and the changing state of the dynamic environment increases the complexity of the planning required to achieve the Agent goals.
The ability to predict those interactions and take advantage of them would increase the efficiency of the plans generated.

Conclusions

This chapter reviews the project as a whole and assesses whether each of the objectives set out at the start has been met. I set out to implement the UCPOP algorithm on the Sony Playstation using a suitable methodology. The implementation itself is complete and I have a working planner running against a dynamic environment on the Playstation platform. The planner addresses most of the Classical assumptions that Knoblock (1996) says should be challenged. The issue of a suitable methodology proved problematic, highlighting the concerns of Aylett et al. (2000) and Schreiber et al. (2000) that AI Planning is a less mature practical field than more traditional Knowledge Based Systems. I have implemented a simulated, dynamic environment that allows varied test scenarios to be constructed. It was possible to capture data from the execution of the test scenarios for later analysis. The actions are non-instantaneous and in one case non-determinate. Whether, given the small number of objects and actions, it could be described as non-trivial, as Cohen et al. (1989) suggest simulated environments should be, is debatable. Whilst the environment is dynamic, it is not comparable with real-world environments. The results obtained so far appear to be in accordance with Penberthy and Weld's (1992) assertion that UCPOP is exponential, but the results as they stand are only indicative due to the limited amount of data gathered. It should be possible to increase the number of test scenarios to gather more data and allow the application of statistical tools and techniques to provide conclusive results. Another area worthy of further research would be the role of belief. The Agent built for the prototype had no concept of future events.
Drawing on the suggestions of Allen (1984) and Blythe (1996), it should be possible for the Agent to reason advantageously about future events rather than blindly react to the current state of the environment. As highlighted by Russell and Norvig (1995), the plans produced by the system are very fragile. The failure of a single action results in a complete re-plan. Further research could be undertaken to improve the level of recovery in the planning process and reduce the number of un-attempted preconditions generated by the planner. The answer to the overall question of whether UCPOP performs satisfactorily against the simulated dynamic environment depends on the definition of 'satisfactory'. Gaines (1996) states that any planner that takes exponential time, as the results suggest, is inefficient and therefore unsatisfactory. But the planner manages to recover from plan failure to achieve the goals of the Agent in all 27 tests against the varied dynamic environment. In that context I believe that the performance is satisfactory, although whether these results can be scaled up to apply to generic, complex real-world environments would require further research.

Appendix A: Glossary

Action - An action makes changes to the environment. Certain preconditions may need to exist within the environment before the effects of the action can be applied. Application of the action effects in a dynamic environment is neither determinate nor instantaneous, as had been assumed in Classical Planning.

Causal Link - A link, within a plan, between an action effect and a goal precondition that it meets. Used to assess when new actions threaten the effects of actions already in place in the plan.

Classical Planning Assumptions - Simplifications during early AI Planning research that assumed environments were static, action application was determinate and action effects were applied instantaneously.
Dynamic Event - An event that causes changes to the environment and that is not a result of the effects of Agent actions.

Environment - An environment contains a set of objects and actions that the Agent can use to achieve its goal state. Classical AI Planning assumes a static environment in which the only source of change is the effect of Agent actions. Dynamic environments are changed by events not initiated by Agent actions.

Goal - A desired state within the environment that an Agent will make plans to achieve.

Ordering - Identifies pairs of actions where one action must be executed before another. Not all actions will be ordered in a Partially Ordered Plan.

Plan - A set of actions that, when applied to an environment in a known state, will apply changes to meet Agent goals.

Planner - An algorithm that takes a set of Agent goals, a set of applicable actions and the initial state of the environment and creates a plan. A planner is said to be sound if the plan that it returns achieves the goals, and complete if it finds a solution whenever one exists.

Precondition - A state of the environment that must exist before an action can be applied or a goal can be achieved.

Search Node - An individual node within a search tree. Branches from the node represent the available expansion options.

Search Tree - Made up of nodes, a search tree is the expansion of all possible options from an initial state in search of a path to a goal state. Search trees tend to become very large very quickly. Many approaches to search tree expansion exist.

Appendix B: Design Documents

Route Selection Pseudo Code

The following pseudo code selects the goal node with the lowest path cost from the initial node and copies the route using parent node links.

Route Precondition Pseudo Code

The following pseudo code looks along the selected route to identify any additional preconditions that need to be met before the Agent can traverse it.
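The full pseudo code is contained on the included diskette. As an illustration only, the route-selection step described above might be sketched in C as follows. The node layout and identifiers here are simplified assumptions for the sketch; they are not the evaluation system's actual SearchTree_Struct or SelectRoute() definitions.

```c
#include <stddef.h>

#define MAX_ROUTE_LEN 64

/* Illustrative search-node layout (assumed, not the real structure). */
typedef struct SearchNode {
    int x, y;                  /* grid location represented by the node  */
    int pathCost;              /* accumulated cost from the initial node */
    int isGoal;                /* non-zero if the node meets the goal    */
    struct SearchNode *parent; /* link back towards the initial node     */
} SearchNode;

/* Select the goal node with the lowest path cost, then copy the route
 * by following parent node links back to the initial node.  Returns
 * the number of steps written into route[], initial node first, or 0
 * if no goal node exists or the route is too long. */
static int SelectRoute(SearchNode *nodes, int numNodes,
                       SearchNode *route[], int maxLen)
{
    SearchNode *best = NULL;
    SearchNode *n;
    int i, len = 0;

    /* Scan the tree for the cheapest goal node. */
    for (i = 0; i < numNodes; i++)
        if (nodes[i].isGoal &&
            (best == NULL || nodes[i].pathCost < best->pathCost))
            best = &nodes[i];

    if (best == NULL)
        return 0;

    /* Count the steps from the goal back to the initial node. */
    for (n = best; n != NULL; n = n->parent)
        len++;
    if (len > maxLen)
        return 0;

    /* Fill the route from the far end so it reads initial -> goal. */
    for (n = best, i = len - 1; n != NULL; n = n->parent, i--)
        route[i] = n;

    return len;
}
```

The parent links make the copy cheap: the route is recovered in a single backwards walk, with no additional search of the tree required.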
Process Model - Initialisation

The following diagram shows the Process Model for application Initialisation.

Appendix C: Implementation Documents

Implementation design documentation for the following C structures is contained on the included diskette:

Agent_Struct, Precondition_Struct, Key_Struct, Effect_Struct, Exit_Struct, UnmetPrecondition_Struct, Door_Struct, AvailablePrecons_Struct, Bridge_Struct, ActionEffects_Struct, Route_Struct, AvailableActions_Struct, SearchTree_Struct, CausalLinks_Struct, SearchGoalNodes_Struct, Orderings, Condition_Struct, Plan_Struct

Implementation design documentation for the following C functions is contained on the included diskette:

DisplayInit(), AddNodeToSearchTree(), DisplayInitDrawingCycle(), GenerateSearchTree(), DisplayFinishDrawingCycle(), ExpandSearchTree(), DisplayVerticalBlank(), CheckAlreadyProcessed(), DisplayCloseDrawingCycle(), SelectRoute(), DisplayClose(), PreconditionEffectInit(), MessageSpriteInit(), BuildPlan(), DisplayMessage(), InitPlan(), NumberDisplay(), GetRoutePreconditions(), DisplayEnvironment(), SetAgentGoal(), DisplayRoute(), AddUnmetPreconditions(), DisplayAccessibility(), AddActionEffect(), ObjectInit(), AddOrdering(), DisplayObject(), AddCausalLink(), DisplayStandardText(), InitExecution(), DynamicEvents(), GetNextActionEffect(), InitSearchTree(), ExecutePlan(), CheckLocationForAccess(), FindEffect()

Appendix D: Test Scenario Definition

This appendix details the configuration of the environment, and the objects contained within it, for each value of the parameters varied during the test scenarios.
Object Numbers and Attributes by Complexity

The following table shows the number of BRIDGEs and DOORs at each level of complexity:

  Complexity Parameter   Number of Bridges   Number of Doors
  1 (High)               9                   8
  2 (Medium)             6                   4
  3 (Low)                5                   0

The following table shows the location of the KEY at each level of complexity:

  Complexity Parameter   Key Location (X,Y)
  1 (High)               (9,1)
  2 (Medium)             (3,3)
  3 (Low)                (5,7)

Agent Start Location

The following table gives the Agent start location for each value of the Agent start location parameter:

  Start Location Parameter   Agent Start Location (X,Y)
  1                          (0,1)
  2                          (0,5)
  3                          (0,8)

Configuration of the Low Complexity Environment

The diagram below shows the configuration of the dynamic environment for Low Complexity scenarios.

The table below shows the initial attribute settings for the BRIDGEs in the Low Complexity scenarios (RDC = Rate of Dynamic Change):

  Bridge   Location   Raised/Lowered   Change Counter   Duration Between Status Changes
  1        (0,3)      RAISED           75               150 * RDC
  2        (5,4)      LOWERED          20               180 * RDC
  3        (2,7)      RAISED           205              210 * RDC
  4        (7,9)      LOWERED          25               195 * RDC
  5        (8,3)      LOWERED          125              165 * RDC
  6-9      -          -                -                -

No DOORs are present in the Low Complexity scenarios; all eight DOOR entries are unused.

Configuration of the Medium Complexity Scenarios

The diagram below shows the configuration of the dynamic environment for Medium Complexity scenarios.

The table below shows the initial attribute settings for the BRIDGEs in the Medium Complexity scenarios:

  Bridge   Location   Raised/Lowered   Change Counter   Duration Between Status Changes
  1        (3,1)      RAISED           75               185 * RDC
  2        (1,3)      LOWERED          20               205 * RDC
  3        (5,6)      RAISED           205              210 * RDC
  4        (6,8)      LOWERED          125              165 * RDC
  5        (8,2)      RAISED           150              225 * RDC
  6        (1,5)      LOWERED          140              175 * RDC
  7-9      -          -                -                -

The table below shows the initial attribute settings for the DOORs in the Medium Complexity scenarios:

  Door   Location   Open Status   Lock Status
  1      (1,0)      CLOSED        UNLOCKED
  2      (2,8)      CLOSED        UNLOCKED
  3      (4,8)      CLOSED        LOCKED
  4      (5,0)      CLOSED        LOCKED
  5-8    -          -             -

Configuration of the High Complexity Scenarios

The diagram below shows the configuration of the dynamic environment for High Complexity scenarios.

The table below shows the initial attribute settings for the BRIDGEs in the High Complexity scenarios:

  Bridge   Location   Raised/Lowered   Change Counter   Duration Between Status Changes
  1        (3,3)      RAISED           75               150 * RDC
  2        (2,9)      LOWERED          20               180 * RDC
  3        (3,7)      RAISED           205              210 * RDC
  4        (4,0)      LOWERED          125              165 * RDC
  5        (4,2)      RAISED           150              225 * RDC
  6        (4,6)      LOWERED          90               195 * RDC
  7        (6,5)      RAISED           50               175 * RDC
  8        (8,0)      LOWERED          10               155 * RDC
  9        (8,2)      LOWERED          70               125 * RDC

The table below shows the initial attribute settings for the DOORs in the High Complexity scenarios:

  Door   Location   Open Status   Lock Status
  1      (1,6)      CLOSED        UNLOCKED
  2      (2,0)      CLOSED        UNLOCKED
  3      (3,5)      CLOSED        LOCKED
  4      (5,7)      CLOSED        LOCKED
  5      (5,9)      CLOSED        UNLOCKED
  6      (6,2)      CLOSED        UNLOCKED
  7      (8,5)      CLOSED        LOCKED
  8      (8,8)      CLOSED        LOCKED

Constants

The following constant value exists for all test scenarios:

  Exit Location = (9,4)

Appendix E: Test Results

Test result data averaged by the three test parameters is presented below. The full set of test results gathered is included on the accompanying diskette in the Microsoft Excel file plan.xls.
Averages by Environment Complexity

The following table shows the information captured from the execution of the 27 test scenarios, averaged by the Environment Complexity parameter.

Averages by Rate of Dynamic Change

The following table shows the information captured from the execution of the 27 test scenarios, averaged by the Rate of Dynamic Change parameter.

Averages by Agent Start Location

The following table shows the information captured from the execution of the 27 test scenarios, averaged by the Agent Start Location parameter.

Appendix F: Diskette Contents

The following table gives details of the files contained on the accompanying diskette:

  _psstart.c (Text) - Sony-supplied code that starts the bespoke application once it has finished loading on the Playstation.
  Appendix C - Implementation Documents.doc (Microsoft Word) - Implementation documentation for the C structures and functions contained in the evaluation system.
  display.c (Text) - GUI representation of the state of the environment and the objects within it.
  drawing.c (Text) - Routines to control the Playstation display hardware.
  executeplan.c (Text) - Executes an Agent plan against the dynamic environment and monitors progress.
  funcprototypes.h (Text) - Header information specifying function parameters and returned information.
  init.c (Text) - Initialisation routines.
  metric.c (Text) - Outputs metrics from the execution of the test scenarios against the planner.
  NY_Prefix.h (Text) - Sony-supplied header file containing presets for the Net Yaroze programmable Playstation.
  pad.h (Text) - Sony-supplied header information for interfacing with the controller pads.
  plan.c (Text) - Routines to generate the plan to achieve Agent goals.
  plan.xls (Microsoft Excel) - Amalgamated results obtained from the execution of the 27 test scenarios against the planner.
  planner.c (Text) - Main routine that controls the plan/execute/monitor cycle and the Playstation routines to control the display hardware.
  planner.h (Text) - Header information about the environment and planner.
  search.c (Text) - Implementation of the search algorithms.

References

Agre, P.E. and Chapman, D. (1987) Pengi: An Implementation of a Theory of Activity, AAAI. Reprinted in Computation and Intelligence: Collected Readings, ed. George F. Luger (1995).
Allen, J.F. (1984) Towards a General Theory of Action and Time, Artificial Intelligence 23 (1984) 123-154, Elsevier.
Ambros-Ingerson, J.A. and Steel, S. (1988) Integrating Planning, Execution and Monitoring, Automated Reasoning.
Aylett, R.S., Petley, G.J., Chung, P.W.H., Chen, B. and Edwards, D.W. (2000) AI Planning: Solutions for Real World Problems, Knowledge Based Systems 13 (2000) 61-69, Elsevier.
Blythe, J. (1996) A Representation for Efficient Planning in Dynamic Domains with External Events, AAAI Press (1997), in Theories of Action, Planning and Robot Control: Bridging the Gap - Technical Report WS-96-07.
Chapman, D. (1987) Planning for Conjunctive Goals, Artificial Intelligence 32 (1987) 333-377, Elsevier.
Cohen, P.R., Greenberg, M.L., Hart, D.M. and Howe, A.E. (1989) Trial By Fire, AI Magazine, Fall 1989, AAAI.
Fikes, R.E. and Nilsson, N.J. (1972) STRIPS: A New Approach to the Application of Theorem Proving to Problem Solving, Artificial Intelligence, Elsevier. Reprinted in Computation and Intelligence: Collected Readings, ed. George F. Luger (1995), AAAI Press.
Gaines, D.M. (1996) GTD-POP: Bridging the Gap Between Soundness and Efficiency in Practical Planners, Morgan Kaufmann, in Theories of Action, Planning and Robot Control: Bridging the Gap - Technical Report WS-96-07.
Ginsberg, M.L. (1993) Essentials of Artificial Intelligence, Morgan Kaufmann.
Haigh, K.Z. and Veloso, M.M. (1998) Interleaving Planning and Robot Execution for Asynchronous User Requests, Autonomous Robots 5, 79-95 (1998), Kluwer Academic Publishers.
Haigh, K.Z. and Veloso, M.M. (1996) Planning for Dynamic Goals for Robot Execution, AAAI Press, in Papers from the 1996 AAAI Fall Symposium - Technical Report FS-96-01.
Hanks, S., Pollack, M.E. and Cohen, P.R. (1993) Benchmarks, Test Beds, Controlled Experimentation and the Design of Agent Architectures, AI Magazine, Winter 1993, AAAI.
Hawkins, R., Hawley, M.C., Stricklen, J., Qiu, Y. and McDowell, J.K. (1996) Further Developments on Applying Generic Task Approaches to the Industrial Process Control Problem, AAAI Press, in Plan Execution: Problems and Issues - Technical Report FS-96-01.
Knoblock, C.A. (1996) Why Plan Generation and Plan Execution are Inseparable, AAAI Press, in Plan Execution: Problems and Issues - Technical Report FS-96-01.
McCarthy, C.E. and Pollack, M.E. (2000) Towards Focused Plan Monitoring: A Technique and an Application to Mobile Robots, Autonomous Robots 9, 71-81 (2000), Kluwer Academic Publishers.
McDermott, D., Howe, A., Knoblock, C., Ram, A., Veloso, M., Weld, D. and Wilkins, D. (1998) PDDL - The Planning Domain Definition Language, Yale Center for Computational Vision and Control, Tech Report CVC TR-98-003/DCS TR-1165.
McDermott, D. and Hendler, J. (1995) Planning: What it is, What it Could Be, An Introduction to the Special Issue on Planning and Scheduling, Artificial Intelligence 76 (1995) 1-16, Elsevier.
Minton, S., Drummond, M., Bresina, J.L. and Philips, A.B. (1992) Total Order v Partial Order Planning: Factors Influencing Performance, Morgan Kaufmann, in Principles of Knowledge Representation and Reasoning - Proceedings of the Third International Conference.
Penberthy, J.S. and Weld, D.S. (1992) UCPOP: A Sound, Complete, Partial Order Planner for ADL, Morgan Kaufmann, in Principles of Knowledge Representation and Reasoning - Proceedings of the Third International Conference.
Pollack, M.E. and Ringuette, M. (1990) Introducing the Tileworld: Experimentally Evaluating Agent Architectures, Automated Reasoning.
Punch, W.F., Decker, D.B., Stricklen, J. and McDowell, J.K. (1996) IPCA: An Architecture for Planning and Real Time Plan Execution, AAAI Press, in Plan Execution: Problems and Issues - Technical Report FS-96-01.
Russell, S. and Norvig, P. (1995) Artificial Intelligence: A Modern Approach, Prentice Hall Inc.
Schreiber, G., Akkermans, H., Anjewierden, A., de Hoog, R., Shadbolt, N., Van de Velde, W. and Wielinga, B. (2000) Knowledge Engineering and Management: The CommonKADS Methodology, MIT Press.
Tansley, D.S.W. and Hayball, C.C. (1993) Knowledge Based Systems: Analysis and Design, Prentice Hall Europe.
Weld, D.S. (1999) Recent Advances in AI Planning, AI Magazine, Summer 1999, AAAI.
Weld, D.S. (1994) An Introduction to Least Commitment Planning, AI Magazine, Winter 1994, AAAI.

RDC = Rate of Dynamic Change

M801 CCI MSc Dissertation - Brian Warner, P2256553 - Date: 22 September 2001 - Version 1.4